consistency condition
Visual Prompt Tuning in Null Space for Continual Learning
Lu, Yue
To be concrete, we first take the full self-attention and LayerNorm into consideration and derive a strict condition for eliminating interference through a comprehensive analysis of the forward propagation of the ViT layer. We then propose to replace the self-attention condition with two sufficient conditions, which enables us to address the challenges of high order and nonlinearity.
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.68)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
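The null-space idea behind the abstract can be sketched generically: project a candidate prompt update onto the null space of the previous tasks' feature matrix, so the outputs on old features are provably unchanged. A minimal NumPy sketch of that generic linear condition (not the paper's full self-attention/LayerNorm conditions; all names below are illustrative):

```python
import numpy as np

def null_space_projector(F, tol=1e-10):
    # F: (n_samples, d) matrix of previous-task features.
    # Returns the (d, d) projector onto the null space {v : F v = 0}.
    _, s, Vt = np.linalg.svd(F, full_matrices=True)
    rank = int(np.sum(s > tol))
    V_null = Vt[rank:].T          # orthonormal basis of the null space
    return V_null @ V_null.T

rng = np.random.default_rng(0)
F = rng.normal(size=(3, 8))       # 3 previous-task feature rows in 8-dim space
P = null_space_projector(F)
g = rng.normal(size=8)            # candidate prompt-update direction
g_proj = P @ g                    # projected update
print(np.allclose(F @ g_proj, 0))  # True: old features are untouched
```

Training with `g_proj` instead of `g` leaves the responses to all stored features exactly unchanged, which is the interference-elimination property the condition formalizes.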
Forgetting is Everywhere
Sanati, Ben, Lee, Thomas L., McInroe, Trevor, Scannell, Aidan, Malkin, Nikolay, Abel, David, Storkey, Amos
A fundamental challenge in developing general learning algorithms is their tendency to forget past knowledge when adapting to new data. Addressing this problem requires a principled understanding of forgetting; yet, despite decades of study, no unified definition has emerged that provides insights into the underlying dynamics of learning. We propose an algorithm- and task-agnostic theory that characterises forgetting as a lack of self-consistency in a learner's predictive distribution over future experiences, manifesting as a loss of predictive information. Our theory naturally yields a general measure of an algorithm's propensity to forget. To validate the theory, we design a comprehensive set of experiments that span classification, regression, generative modelling, and reinforcement learning. We empirically demonstrate how forgetting is present across all learning settings and plays a significant role in determining learning efficiency. Together, these results establish a principled understanding of forgetting and lay the foundation for analysing and improving the information retention capabilities of general learning algorithms.
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- North America (0.14)
- Asia > China > Shaanxi Province > Xi'an (0.04)
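The proposed measure is distributional; as a toy illustration, forgetting can be scored as the average divergence between a learner's predictive distributions on past inputs before and after an update. This KL-based stand-in is a simplification for intuition, not the paper's exact measure:

```python
import numpy as np

def kl(p, q, eps=1e-12):
    # KL divergence between two categorical predictive distributions.
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def forgetting_score(pred_before, pred_after):
    # Average per-example divergence: zero iff the learner's predictions
    # on past data stay self-consistent across the update.
    return float(np.mean([kl(p, q) for p, q in zip(pred_before, pred_after)]))

before      = [np.array([0.8, 0.2]), np.array([0.1, 0.9])]
after_same  = [np.array([0.8, 0.2]), np.array([0.1, 0.9])]
after_drift = [np.array([0.4, 0.6]), np.array([0.1, 0.9])]
print(forgetting_score(before, after_same))   # 0.0: no forgetting
print(forgetting_score(before, after_drift) > 0)  # True: information lost
```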
Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRPO and Its Friends
Yao, Chaorui, Chen, Yanxi, Sun, Yuchang, Chen, Yushuo, Zhang, Wenhao, Pan, Xuchen, Li, Yaliang, Ding, Bolin
The past few years have witnessed rapid progress in reinforcement learning (RL) for large language models (LLMs). This began with reinforcement learning from human feedback (RLHF) [Bai et al., 2022, Ouyang et al., 2022] that aligns pre-trained LLMs with human preferences, followed by reasoning-oriented RL that enables LLMs to produce long chains of thought [OpenAI, 2024, DeepSeek-AI, 2025, Kimi-Team, 2025b, Zhang et al., 2025b]. More recently, agentic RL [Kimi-Team, 2025a, Gao et al., 2025, Zhang et al., 2025a] aims to train LLMs for agentic capabilities such as tool use, long-horizon planning, and multi-step task execution in dynamic environments. Alongside these developments, off-policy RL has been attracting growing interest. In the "era of experience" [Silver and Sutton, 2025], LLM-powered agents need to be continually updated through interaction with the environment. Practical constraints in real-world deployment and the complexity of LLM-RL infrastructure often render on-policy training impractical [Noukhovitch et al., 2025]: rollout generation and model training can proceed at mismatched speeds, data might be collected from different policies, reward feedback might be irregular or delayed, and the environment may be too costly or unstable to query for fresh trajectories.
- Asia > China > Jiangsu Province > Yancheng (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
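For readers unfamiliar with GRPO, the two ingredients the title alludes to can be sketched in a few lines: a group-relative baseline (rewards standardized within a group of rollouts for the same prompt) and an importance ratio that corrects for data generated by a stale behaviour policy. This is an illustrative sketch, not any particular implementation:

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    # GRPO-style baseline: standardize rewards within one prompt's group.
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

def off_policy_weight(logp_new, logp_old, adv):
    # REINFORCE gradient weight with an importance ratio pi_new / pi_old,
    # making the estimator valid for off-policy (stale-rollout) data.
    return np.exp(logp_new - logp_old) * adv

adv = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
print(abs(adv.mean()) < 1e-9)           # True: the group mean is the baseline
print(off_policy_weight(0.0, 0.0, 1.0))  # 1.0: on-policy recovers plain REINFORCE
```

When the ratio is identically one (fresh on-policy rollouts), the weight reduces to the plain group-relative REINFORCE estimator, which is the connection the paper examines.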
Macroeconomic Foundation of Monetary Accounting by Diagrams of Categorical Universals
Menéndez, Renée, Winschel, Viktor
We present a category-theoretical formulation of the Monetary Macroeconomic Accounting Theory (MoMaT) of Menéndez and Winschel [2025]. We take macroeconomic (national) accounting systems to be composed from microeconomic double-entry systems with real and monetary units of account. Category theory is the compositional grammar and module system of mathematics, which we use to lift micro accounting consistency to the macro level. The main function of money in MoMaT is the repayment of loans, not the exchange of goods: money bridges the desynchronisation of producers' input and output payments. Accordingly, temporal accounting consistency resides at the macroeconomic level. We show that the accounting for macroeconomies organised by a division of labor can be consistent and stable, a prerequisite for risk and GDP sharing in societies. We exemplify the theory with five sectoral agents: Labor and Resource owners, a Company as the productive sector, a Capitalist for profits, and a Bank as the financial sector providing loans to synchronise the micro and macro levels of an economy. The dynamics are described by eight sectoral macroeconomic bookings in each period, and numerical simulations demonstrate stable convergence of the MoMaT. The categorical program implements a consistent evolution of hierarchical loan-repayment contracts by an endofunctor. The universal construction of a limit verifies all constraints as the sectoral investment and learning function at the macroeconomic level. The dual colimit computes the aggregated information at the macro level, as is usual in the mathematics of transitions from local to global structures. We use visual diagrams to make complex economic relationships intuitive. This paper is meant to map economic concepts to categorical ones, enabling interdisciplinary collaboration on digital twins of monetary accounting systems.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.13)
- Asia > Middle East > Iran > Arabian Gulf (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- (2 more...)
- Banking & Finance > Trading (1.00)
- Banking & Finance > Loans (1.00)
- Banking & Finance > Economy (1.00)
- Government > Regional Government (0.92)
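The lifting of micro double-entry consistency to the macro level can be illustrated with a toy ledger: every booking nets to zero, so any composition of sectoral bookings nets to zero as well. The bookings below are hypothetical stand-ins, not MoMaT's eight:

```python
from collections import defaultdict

def book(ledger, debit, credit, amount):
    # A double-entry booking: the amount appears twice with opposite
    # signs, so each individual booking sums to zero by construction.
    ledger[debit] -= amount
    ledger[credit] += amount

ledger = defaultdict(float)
book(ledger, "Bank", "Company", 100.0)   # loan issued
book(ledger, "Company", "Labor", 60.0)   # wages paid
book(ledger, "Labor", "Company", 60.0)   # goods bought
book(ledger, "Company", "Bank", 100.0)   # loan repaid

# Micro consistency lifts to the macro level: sectoral balances sum to zero.
print(sum(ledger.values()))  # 0.0
```

This zero-sum invariant under composition is exactly the kind of local-to-global property that the categorical machinery (limits and colimits) makes precise.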
Set-Rationalizable Choice and Self-Stability
Brandt, Felix, Harrenstein, Paul
A common assumption in modern microeconomic theory is that choice should be rationalizable via a binary preference relation, which \citeauthor{Sen71a} showed to be equivalent to two consistency conditions, namely $\alpha$ (contraction) and $\gamma$ (expansion). Within the context of \emph{social} choice, however, rationalizability and similar notions of consistency have proved to be highly problematic, as witnessed by a range of impossibility results, among which Arrow's is the most prominent. Since choice functions select \emph{sets} of alternatives rather than single alternatives, we propose to rationalize choice functions by preference relations over sets (set-rationalizability). We also introduce two consistency conditions, $\hat\alpha$ and $\hat\gamma$, which are defined in analogy to $\alpha$ and $\gamma$, and find that a choice function is set-rationalizable if and only if it satisfies $\hat\alpha$. Moreover, a choice function satisfies $\hat\alpha$ and $\hat\gamma$ if and only if it is \emph{self-stable}, a new concept based on earlier work by \citeauthor{Dutt88a}. The class of self-stable social choice functions contains a number of appealing Condorcet extensions such as the minimal covering set and the essential set.
- North America > United States > Michigan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany (0.04)
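A contraction-style condition on set-valued choice functions is easy to check mechanically on small domains. The statement of $\hat\alpha$ used below is a paraphrase in the spirit of the abstract (if the chosen set of a feasible set $B$ is contained in a subset $A \subseteq B$, the choice from $A$ must coincide), and the lexicographic-minimum choice function is purely illustrative:

```python
from itertools import combinations

def powerset_nonempty(items):
    # All non-empty subsets of a finite set, as frozensets.
    for r in range(1, len(items) + 1):
        yield from (frozenset(c) for c in combinations(sorted(items), r))

def satisfies_alpha_hat(choice, universe):
    # Paraphrased alpha-hat: for all A <= B with choice[B] <= A,
    # require choice[A] == choice[B].
    for B in powerset_nonempty(universe):
        for A in powerset_nonempty(B):
            if choice[B] <= A and choice[A] != choice[B]:
                return False
    return True

universe = frozenset({"a", "b", "c"})
# Illustrative choice function: always pick the lexicographic minimum.
choice = {S: frozenset({min(S)}) for S in powerset_nonempty(universe)}
print(satisfies_alpha_hat(choice, universe))  # True
```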
Standardization of Weighted Ranking Correlation Coefficients
A relevant problem in statistics is defining the correlation of two rankings of a list of items. Kendall's tau and Spearman's rho are two well-established correlation coefficients, characterized by a symmetric form that ensures a zero expected value between two rankings chosen at random with uniform probability. However, in recent years, several weighted versions of the original Spearman and Kendall coefficients have emerged that account for the greater importance of top ranks compared to low ranks, which is common in many contexts. The weighting schemes break the symmetry, causing a non-zero expected value between two random rankings. This issue is significant, as it undermines the notion of uncorrelatedness between rankings. In this paper, we address this problem by proposing a standardization function $g(x)$ that maps a ranking correlation coefficient $\Gamma$ to a standardized form $g(\Gamma)$ with zero expected value, while preserving the relevant statistical properties of $\Gamma$.
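The non-zero expected value the abstract describes is easy to exhibit by Monte Carlo: weight agreement at top ranks more heavily and average the coefficient over random rankings. The mean shift used below is only the simplest conceivable standardization (a hypothetical stand-in, not the paper's $g$, which preserves further properties of $\Gamma$):

```python
import numpy as np

def weighted_corr(x, y, w):
    # Weighted Pearson correlation with normalized weights.
    w = w / w.sum()
    mx, my = w @ x, w @ y
    cov = w @ ((x - mx) * (y - my))
    return cov / np.sqrt((w @ (x - mx) ** 2) * (w @ (y - my) ** 2))

def weighted_spearman(r, s):
    # Spearman-type coefficient with hyperbolic weights on the ranks of r,
    # so agreement at the top of r counts more; this breaks symmetry.
    r, s = np.asarray(r, float), np.asarray(s, float)
    return weighted_corr(r, s, 1.0 / r)

rng = np.random.default_rng(1)
ident = np.arange(1, 9, dtype=float)          # identity ranking of 8 items
samples = np.array([weighted_spearman(ident, rng.permutation(ident))
                    for _ in range(5000)])
mu = samples.mean()                # non-zero: the weighting breaks symmetry
standardized = samples - mu        # crude g: shift so the mean is zero
print(round(weighted_spearman(ident, ident), 6))  # 1.0
print(abs(standardized.mean()) < 1e-12)           # True by construction
```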
A General Framework for Constraint-based Causal Learning
Teh, Kai Z., Sadeghi, Kayvan, Soo, Terry
By representing any constraint-based causal learning algorithm via a placeholder property, we decompose the correctness condition into a part relating the distribution and the true causal graph, and a part that depends solely on the distribution. This provides a general framework for obtaining correctness conditions for causal learning, with the following implications. We provide exact correctness conditions for the PC algorithm, which are then related to the correctness conditions of some other existing causal discovery algorithms. We show that the sparsest Markov representation condition is the weakest correctness condition resulting from existing notions of minimality for maximal ancestral graphs and directed acyclic graphs. We also argue that knowledge beyond Pearl-minimality is necessary for causal learning beyond faithfulness.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- (2 more...)
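For context, the object whose correctness conditions the paper characterises is the PC algorithm; its skeleton phase deletes an edge x–y whenever x and y are conditionally independent given some subset of current neighbours. A textbook sketch with a conditional-independence oracle (not the paper's placeholder-property framework):

```python
from itertools import combinations

def pc_skeleton(nodes, ci_oracle):
    # Skeleton phase of the PC algorithm: start from the complete graph
    # and remove x-y when x _||_ y | Z for some Z among x's neighbours.
    adj = {v: set(nodes) - {v} for v in nodes}
    depth = 0
    while any(len(adj[x] - {y}) >= depth for x in nodes for y in adj[x]):
        for x in nodes:
            for y in list(adj[x]):
                for Z in combinations(sorted(adj[x] - {y}), depth):
                    if ci_oracle(x, y, frozenset(Z)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        break
        depth += 1
    return {frozenset((x, y)) for x in nodes for y in adj[x]}

# Oracle for the chain a -> b -> c: the only independence is a _||_ c | {b}.
def oracle(x, y, Z):
    return {x, y} == {"a", "c"} and "b" in Z

skeleton = pc_skeleton(["a", "b", "c"], oracle)
print(sorted(tuple(sorted(e)) for e in skeleton))  # [('a', 'b'), ('b', 'c')]
```

With a perfect oracle the recovered skeleton matches the true graph; the paper's conditions characterise exactly when a finite-sample or distributional surrogate for this oracle suffices.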
Unified continuous-time q-learning for mean-field game and mean-field control problems
Wei, Xiaoli, Yu, Xiang, Yuan, Fengyi
This paper studies continuous-time q-learning in mean-field jump-diffusion models from the representative agent's perspective. To overcome the challenge that the population distribution may not be directly observable, we introduce the integrated q-function in decoupled form (decoupled Iq-function) and establish its martingale characterization together with the value function, which provides a unified policy evaluation rule for both mean-field game (MFG) and mean-field control (MFC) problems. Moreover, depending on whether the task is to solve the MFG or the MFC problem, we can employ the decoupled Iq-function in different ways to learn the mean-field equilibrium policy or the mean-field optimal policy, respectively. As a result, we devise a unified q-learning algorithm for both MFG and MFC problems by utilizing all test policies stemming from the mean-field interactions. For several examples in the jump-diffusion setting, within and beyond the LQ framework, we obtain exact parameterizations of the decoupled Iq-functions and the value functions, and illustrate the satisfactory performance of our algorithm from the representative agent's perspective.
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- Asia > China > Hong Kong > Kowloon (0.04)
- Asia > China > Heilongjiang Province > Harbin (0.04)
- Research Report (0.69)
- Overview (0.47)
Consistency Models Made Easy
Geng, Zhengyang, Pokle, Ashwini, Luo, William, Lin, Justin, Kolter, J. Zico
Consistency models (CMs) are an emerging class of generative models that offer faster sampling than traditional diffusion models. CMs enforce that all points along a sampling trajectory are mapped to the same initial point. But this target leads to resource-intensive training: for example, as of 2024, training a SoTA CM on CIFAR-10 takes one week on 8 GPUs. In this work, we propose an alternative scheme for training CMs, vastly improving the efficiency of building such models. Specifically, by expressing CM trajectories via a particular differential equation, we argue that diffusion models can be viewed as a special case of CMs with a specific discretization. We can thus fine-tune a consistency model starting from a pre-trained diffusion model and progressively approximate the full consistency condition to stronger degrees over the training process. Our resulting method, which we term Easy Consistency Tuning (ECT), achieves vastly improved training times while indeed improving upon the quality of previous methods: for example, ECT achieves a 2-step FID of 2.73 on CIFAR-10 within 1 hour on a single A100 GPU, matching Consistency Distillation trained for hundreds of GPU hours. Owing to this computational efficiency, we investigate the scaling law of CMs under ECT, showing that they seem to obey classic power-law scaling, hinting at their ability to improve efficiency and performance at larger scales. Code (https://github.com/locuslab/ect) is available.